PROTOGENE: turning amino acid alignments into bona fide CDS nucleotide alignments

Authors

  • Sébastien Moretti
  • Frédéric Reinier
  • Olivier Poirot
  • Fabrice Armougom
  • Stéphane Audic
  • Vladimir Keduas
  • Cédric Notredame
Abstract

We describe Protogene, a server that can turn a protein multiple sequence alignment into the equivalent alignment of the original gene coding DNA. Protogene relies on a pipeline where every initial protein sequence is BLASTed against RefSeq or NR. The annotation associated with potential matches is used to identify the gene sequence. This gene sequence is then aligned with the query protein using Exonerate in order to extract a coding nucleotide sequence matching the original protein. Protogene can handle protein fragments and will return every CDS coding for a given protein, even if they occur in different genomes. Protogene is available from http://www.tcoffee.org/.
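The last step of such a pipeline, once a CDS matching each protein has been recovered, amounts to threading the gaps of the protein alignment onto the nucleotide sequences, codon by codon. The Python sketch below illustrates that idea only; it is not Protogene's code. The function name thread_cds_onto_alignment and the toy sequences are hypothetical, and the sketch assumes each CDS is in frame and codes exactly for the ungapped protein (an optional trailing stop codon is ignored).

```python
def thread_cds_onto_alignment(aligned_protein: str, cds: str, gap: str = "-") -> str:
    """Map one aligned protein row onto its CDS, three nucleotides per column.

    Assumption (not from the source): `cds` starts in frame and encodes the
    ungapped residues of `aligned_protein` in order. Extra trailing
    nucleotides (for example a stop codon) are ignored.
    """
    residue_count = sum(1 for symbol in aligned_protein if symbol != gap)
    if len(cds) < 3 * residue_count:
        raise ValueError("CDS is too short for the number of residues")

    codons = []
    position = 0  # index of the next unused nucleotide in the CDS
    for symbol in aligned_protein:
        if symbol == gap:
            # A gap in the protein column becomes a three-nucleotide gap.
            codons.append(gap * 3)
        else:
            codons.append(cds[position:position + 3])
            position += 3
    return "".join(codons)


# Toy data, invented for illustration only.
protein_alignment = {"seq1": "MK-L", "seq2": "M-RL"}
coding_sequences = {"seq1": "ATGAAACTG", "seq2": "ATGCGTCTG"}

cds_alignment = {
    name: thread_cds_onto_alignment(row, coding_sequences[name])
    for name, row in protein_alignment.items()
}

for name, row in cds_alignment.items():
    print(name, row)
```

Because every protein column expands to exactly three nucleotide columns, the resulting rows all have the same length and keep the reading frame, which is what makes the output usable as a codon alignment.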

Similar resources

TranslatorX: multiple alignment of nucleotide sequences guided by amino acid translations

We present TranslatorX, a web server designed to align protein-coding nucleotide sequences based on their corresponding amino acid translations. Many comparisons between biological sequences (nucleic acids and proteins) involve the construction of multiple alignments. Alignments represent a statement regarding the homology between individual nucleotides or amino acids within homologous genes. A...

White spot syndrome virus Orf514 encodes a bona fide DNA polymerase.

White spot syndrome virus (WSSV) is the causative agent of white spot syndrome, one of the most devastating diseases in shrimp aquaculture. The genome of WSSV includes a gene that encodes a putative family B DNA polymerase (ORF514), which is 16% identical in amino acid sequence to the Herpes virus 1 DNA polymerase. The aim of this work was to demonstrate the activity of the WSSV ORF514-encoded ...

Genome bias influences amino acid choices: analysis of amino acid substitution and re-compilation of substitution matrices exclusive to an AT-biased genome

The genomic era has seen a remarkable increase in the number of genomes being sequenced and annotated. Nonetheless, annotation remains a serious challenge for compositionally biased genomes. For the preliminary annotation, popular nucleotide and protein comparison methods such as BLAST are widely employed. These methods make use of matrices to score alignments such as the amino acid substitutio...

Improving spliced alignment for identification of ortholog groups and multiple CDS alignment

The Spliced Alignment Problem (SAP) that consists in finding an optimal semi-global alignment of a spliced RNA sequence on an unspliced genomic sequence has been largely considered for the prediction and the annotation of gene structures in genomes. Here, we re-visit it for the purpose of identifying CDS ortholog groups within a set of CDS from homologous genes and for computing multiple CDS al...

Calculating site-specific evolutionary rates at the amino-acid or codon level yields similar rate estimates

Site-specific evolutionary rates can be estimated from codon sequences or from amino-acid sequences. For codon sequences, the most popular methods use some variation of the dN∕dS ratio. For amino-acid sequences, one widely-used method is called Rate4Site, and it assigns a relative conservation score to each site in an alignment. How site-wise dN∕dS values relate to Rate4Site scores is not known...

Journal:
  • Nucleic Acids Research

Volume 34

Publication date: 2006